Q-Learning with Linear Function Approximation
Authors
Abstract
In this paper, we analyze the convergence of Q-learning with linear function approximation. We identify a set of conditions that implies the convergence of this method with probability 1, when a fixed learning policy is used. We discuss the differences and similarities between our results and those obtained in several related works. We also discuss the applicability of this method when a changing policy is used. Finally, we describe the applicability of this approximate method in partially observable scenarios.
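To make the setting concrete, here is a minimal sketch of Q-learning with linear function approximation under a fixed learning policy, as in the analysis above. The toy MDP, the random feature map, and all parameter values are illustrative assumptions, not taken from the paper; the update rule is the standard one, θ ← θ + α (r + γ maxb θᵀφ(s′,b) − θᵀφ(s,a)) φ(s,a).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MDP (assumed for illustration): 5 states, 2 actions,
# random transition kernel P and reward table R.
n_states, n_actions, n_features = 5, 2, 4
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.standard_normal((n_states, n_actions))

# Fixed random feature map phi(s, a) in R^n_features (an assumption;
# in practice features are chosen for the problem at hand).
phi = rng.standard_normal((n_states, n_actions, n_features))

def q_value(theta, s, a):
    """Linear approximation: Q(s, a) = theta . phi(s, a)."""
    return phi[s, a] @ theta

theta = np.zeros(n_features)
gamma, alpha = 0.9, 0.05  # discount and step size (illustrative)
s = 0
for t in range(5000):
    # Fixed (here: uniform) learning policy, matching the setting
    # in which convergence with probability 1 is analyzed.
    a = rng.integers(n_actions)
    s_next = rng.choice(n_states, p=P[s, a])
    # TD target uses the greedy max over next-state actions.
    target = R[s, a] + gamma * max(
        q_value(theta, s_next, b) for b in range(n_actions)
    )
    td_error = target - q_value(theta, s, a)
    theta += alpha * td_error * phi[s, a]
    s = s_next

print(theta)
```

Because the behavior policy here is fixed rather than greedy in the current estimate, the iterates track a well-defined ODE; this is the off-policy regime whose stability the paper's conditions address.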
Similar Resources
An Online Convergent Q-learning Algorithm with Linear Function Approximation
We present in this article a variant of Q-learning with linear function approximation that is based on two-timescale stochastic approximation. Whereas it is difficult to prove convergence of regular Q-learning with linear function approximation because of the off-policy problem, we prove that our algorithm is convergent. Numerical results on a multi-stage stochastic shortest path problem show t...
On the approximation by Chlodowsky type generalization of (p,q)-Bernstein operators
In the present article, we introduce Chlodowsky variant of $(p,q)$-Bernstein operators and compute the moments for these operators which are used in proving our main results. Further, we study some approximation properties of these new operators, which include the rate of convergence using usual modulus of continuity and also the rate of convergence when the function $f$ belongs to the class Li...
Q-Networks for Binary Vector Actions
In this paper reinforcement learning with binary vector actions was investigated. We suggest an effective architecture of the neural networks for approximating an action-value function with binary vector actions. The proposed architecture approximates the action-value function by a linear function with respect to the action vector, but is still non-linear with respect to the state input. We sho...
Metric entropy and sparse linear approximation of ℓq-hulls for 0<q≤1
Consider ℓq-hulls, 0 < q ≤ 1, from a dictionary of M functions in Lp space for 1 ≤ p < ∞. Their precise metric entropy orders are derived. Sparse linear approximation bounds are obtained to characterize the number of terms needed to achieve accurate approximation of the best function in an ℓq-hull that is closest to a target function. Furthermore, in the special case of p = 2, it is shown that a ...
Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation
We introduce the first temporal-difference learning algorithms that converge with smooth value function approximators, such as neural networks. Conventional temporal-difference (TD) methods, such as TD(λ), Q-learning and Sarsa have been used successfully with function approximation in many applications. However, it is well known that off-policy sampling, as well as nonlinear function approximat...
Published: 2007